NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Qualitative Coding with GPT-4: Where it Works Better

https://doi.org/10.18608/jla.2025.8575

Liu, Xiner; Zambrano, Andres Felipe; Baker, Ryan S; Barany, Amanda; Ocumpaugh, Jaclyn; Zhang, Jiayi; Pankiewicz, Maciej; Nasiar, Nidhi; Wei, Zhanlan (March 2025, Journal of Learning Analytics)

This study explores the potential of the large language model GPT-4 as an automated tool for qualitative data analysis by educational researchers, exploring which techniques are most successful for different types of constructs. Specifically, we assess three different prompt engineering strategies — Zero-shot, Few-shot, and Few-shot with contextual information — as well as the use of embeddings. We do so in the context of qualitatively coding three distinct educational datasets: Algebra I semi-personalized tutoring session transcripts, student observations in a game-based learning environment, and debugging behaviours in an introductory programming course. We evaluated the performance of each approach based on its inter-rater agreement with human coders and explored how different methods vary in effectiveness depending on a construct’s degree of clarity, concreteness, objectivity, granularity, and specificity. Our findings suggest that while GPT-4 can code a broad range of constructs, no single method consistently outperforms the others, and the selection of a particular method should be tailored to the specific properties of the construct and context being analyzed. We also found that GPT-4 has the most difficulty with the same constructs than human coders find more difficult to reach inter-rater reliability on.
more » « less
Free, publicly-accessible full text available March 27, 2026
Investigating Algorithmic Bias on Bayesian Knowledge Tracing and Carelessness Detectors

https://doi.org/10.1145/3636555.3636890

Zambrano, Andres Felipe; Zhang, Jiayi; Baker, Ryan S (March 2024, ACM)

Full Text Available
No Benefit for High-Dosage Time Management Interventions in Online Courses

https://doi.org/10.1145/3573051.3596176

Zhang, Jiayi; Baker, Ryan S.; Farmer, Thomas (July 2023, L@S '23: Proceedings of the Tenth ACM Conference on Learning @ Scale)

In past work, time management interventions involving prompts, alerts, and planning tools have successfully nudged students in online courses, leading to higher engagement and improved performance. However, few studies have investigated the effectiveness of these interventions over time, understanding if the effectiveness maintains or changes based on dosage (i.e., how often an intervention is provided). In the current study, we conducted a randomized controlled trial to test if the effect of a time management intervention changes over repeated use. Students at an online computer science course were randomly assigned to receive interventions based on two schedules (i.e., high-dosage vs. low-dosage). We ran a two-way mixed ANOVA, comparing students' assignment start time and performance across several weeks. Unexpectedly, we did not find a significant main effect from the use of the intervention, nor was there an interaction effect between the use of the intervention and week of the course.
more » « less
Full Text Available
Says Who? How different ground truth measures of emotion impact student affective modeling

https://doi.org/10.5281/zenodo.12729799

Zambrano, Andres Felipe; Nasiar, Nidhi; Ocumpaugh, Jaclyn; Goslen, Alex; Zhang, Jiayi; Rowe, Jonathan; Esiason, Jordan; V, Jessica; enberg; Hutt, Stephen (January 2024, International Educational Data Mining Society)
Benjamin, Paaßen; Carrie, Demmans Epp (Ed.)
Research into student affect detection has historically relied on ground truth measures of emotion that utilize one of three sources of data: (1) self-report data, (2) classroom observations, or (3) sensor data that is retrospectively labeled. Although a few studies have compared sensor- and observation-based ap-proaches to student affective modeling, less work has explored the relationship between self-report and classroom observa-tions. In this study, we use both recurring self-reports (SR) and classroom observation (BROMP) to measure student emotion during a study involving middle school students interacting with a game-based learning environment for microbiology educa-tion. We use supervised machine learning to develop two sets of affect detectors corresponding to SR and BROMP-based measures of student emotion, respectively. We compare the two sets of detectors in terms of their most relevant features, as well as correlations of their output with measures of student learning and interest. Results show that highly predictive features in the SR detectors are different from those selected for BROMP-based detectors. The associations with interest and motivation measures show that while SR detectors captured underlying motivations, the BROMP detectors seemed to capture more in-the-moment information about the student申fs experience. Evi-dence suggests that there is benefit of using both sources of data to model different components of student affect.
more » « less
Full Text Available
Prospective Multiple Antenna Technologies for Beyond 5G

https://doi.org/10.1109/JSAC.2020.3000826

Zhang, Jiayi; Bjornson, Emil; Matthaiou, Michail; Ng, Derrick Wing; Yang, Hong; Love, David J. (August 2020, IEEE Journal on Selected Areas in Communications)
null (Ed.)
Full Text Available

Search for: All records